Checkpoint and consistency markers

ABSTRACT

Described are a method, computer program product, and system for obtaining a copy of source data in a consistent state. One or more file operations having a corresponding time sequence which modify said source data are recorded. A request for a copy of the source data in a consistent state is received. It is determined at which point in the corresponding time sequence said source data is in a consistent state as a result of applying a portion of the file operations. The point in the corresponding time sequence at which the source data is in a consistent state is marked. The portion of file operations determined to place the source data in a consistent state is applied to the copy of the source data.

BACKGROUND

1. Technical Field

This application generally relates to a data storage system, and more particularly to techniques using consistency points and associated consistent copies of data in a data storage system.

2. Description of Related Art

Computer systems may include different resources used by one or more host processors. Resources and host processors in a computer system may be interconnected by one or more communication connections. These resources may include, for example, data storage devices such as those included in the data storage systems manufactured by EMC Corporation. These data storage systems may be coupled to one or more host processors and provide storage services to each host processor. Multiple data storage systems from one or more different vendors may be connected and may provide common data storage for one or more host processors in a computer system.

Different tasks may be performed in connection with a data storage system. For example, processing may be performed in a data storage system for creating and maintaining a mirror copy of data from a source system at a target system. As file operations are performed which cause a data modification to the source system, the modifications may be recorded and then applied to the target system's copy of the data. In connection with performing data operations, it may be desirable to utilize a copy of the data on the target system when the data is in a consistent state. The source or target system may be characterized as being in a consistent state at a point in time, for example, when all outstanding database transactions are committed, any new incoming transactions are placed on hold or “queued”, and any database buffers are flushed with respect to the selected point in time. However, it may be difficult to establish and determine when the copy of the data on the target system is in such a consistent state.

Thus, it may be desirable to have an efficient technique for providing a consistent copy of data on the target system and for determining when the copy of the data on the target system is in a consistent state while minimizing any negative impact on the data of the source system and applications accessing the data on the source system.

BRIEF DESCRIPTION OF THE DRAWINGS

Features and advantages of the present invention will become more apparent from the following detailed description of exemplary embodiments thereof taken in conjunction with the accompanying drawings in which:

FIG. 1 is an example of an embodiment of a computer system that may utilize the techniques described herein;

FIG. 2A is an example of an embodiment of a data storage system;

FIG. 2B is a representation of the logical internal communications between the directors and memory included in one embodiment of the data storage system of FIG. 2A;

FIG. 3 is an example of components that may be used in connection with performing techniques herein for obtaining a consistent copy of data on a target;

FIG. 4 is an example of components that may be used in connection with real-time replication of file operations in order to maintain a mirror copy of source data;

FIG. 5 is an example of components that may be included in a framework used in connection with inserting consistency point markers into a stream of captured file operations;

FIG. 6 is an example of a stream of captured file operations;

FIG. 7 is an example of elements that may be included in a consistency point marker record;

FIG. 8 is a flowchart of processing steps that may be performed in connection with determining when the source data is in a consistent state;

FIG. 9 is a flowchart of processing steps that may be performed by a source system in connection with processing records corresponding to captured file operations and consistency point markers being forwarded to a target system;

FIG. 10 is a flowchart of processing steps that may be performed by a target system in connection with processing records corresponding to captured file operations and consistency point markers received from a source system;

FIGS. 11 and 12 are examples illustrating other configurations in which the techniques and components described herein may be utilized; and

FIG. 13 is an example illustrating in more detail a data storage system configuration that may be used in connection with the techniques described herein.

DETAILED DESCRIPTION OF EMBODIMENT(S)

Referring now to FIG. 1, shown is an example of an embodiment of a computer system that may be used in connection with performing the techniques described herein. The computer system 10 includes a data storage system 12 connected to host systems 14 a-14 n through communication medium 18. In this embodiment of the computer system 10, the N hosts 14 a-14 n may access the data storage system 12, for example, in performing input/output (I/O) operations or data requests. The communication medium 18 may be any one or more of a variety of networks or other type of communication connections as known to those skilled in the art. The communication medium 18 may be a network connection, bus, fabric, and/or other type of data link, such as a hardwire or other connections known in the art. For example, the communication medium 18 may be the Internet, an intranet, network or other wireless or other hardwired connection(s) by which the host systems 14 a-14 n may access and communicate with the data storage system 12, and may also communicate with others included in the computer system 10.

Each of the host systems 14 a-14 n and the data storage system 12 included in the computer system 10 may be connected to the communication medium 18 by any one of a variety of connections as may be provided and supported in accordance with the type of communication medium 18. The processors included in the host computer systems 14 a-14 n may be any one of a variety of proprietary or commercially available single or multi-processor systems, such as an Intel-based processor, or other type of commercially available processor able to support traffic in accordance with each particular embodiment and application. It should be noted that the particular examples of the hardware and software that may be included in the data storage system 12 are described herein in more detail, and may vary with each particular embodiment. The host computers 14 a-14 n and the data storage system may all be located at the same physical site, or, alternatively, may be located in different physical locations. The communication medium that may be used to provide the different types of connections between the host computer systems and the data storage system of the computer system 10 may use a variety of different communication protocols such as SCSI, Fibre Channel, iSCSI, and the like. Some or all of the connections by which the hosts, management component(s), and data storage system may be connected to the communication medium may pass through other communication devices, such as a Connectrix or other switching equipment that may exist such as a phone line, a repeater, a multiplexer or even a satellite.

Each of the host computer systems may perform different types of data operations in accordance with different types of tasks. In the embodiment of FIG. 1, any one of the host computers 14 a-14 n may issue a data request to the data storage system 12 to perform a data operation. For example, an application executing on one of the host computers 14 a-14 n may perform a read or write operation resulting in one or more data requests to the data storage system 12.

Referring now to FIG. 2A, shown is an example of an embodiment of the data storage system 12 that may be included in the computer system 10 of FIG. 1. Included in the data storage system 12 of FIG. 2A are one or more data storage systems 20 a-20 n as may be manufactured by one or more different vendors. Each of the data storage systems 20 a-20 n may be inter-connected (not shown). Additionally, the data storage systems may also be connected to the host systems through any one or more communication connections 31 that may vary with each particular embodiment and device in accordance with the different protocols used in a particular embodiment. The type of communication connection used may vary with certain system parameters and requirements, such as those related to bandwidth and throughput required in accordance with a rate of I/O requests as may be issued by the host computer systems, for example, to the data storage system 12. In this example as described in more detail in following paragraphs, reference is made to the more detailed view of element 20 a. It should be noted that a similar more detailed description may also apply to any one or more of the other elements, such as 20 n, but has been omitted for simplicity of explanation. It should also be noted that an embodiment may include data storage systems from one or more vendors. Each of 20 a-20 n may be resources included in an embodiment of the computer system 10 of FIG. 1 to provide storage services to, for example, host computer systems. It should be noted that the data storage system 12 may operate stand-alone, or may also be included as part of a storage area network (SAN) that includes, for example, other components.

Each of the data storage systems, such as 20 a, may include a plurality of disk devices or volumes, such as the arrangement 24 consisting of n rows of disks or volumes 24 a-24 n. In this arrangement, each row of disks or volumes may be connected to a disk adapter (“DA”) or director responsible for the backend management of operations to and from a portion of the disks or volumes 24. In the system 20 a, a single DA, such as 23 a, may be responsible for the management of a row of disks or volumes, such as row 24 a.

The system 20 a may also include one or more host adapters (“HAs”) or directors 21 a-21 n. Each of these HAs may be used to manage communications and data operations between one or more host systems and the global memory. In an embodiment, the HA may be a Fibre Channel Adapter or other adapter which facilitates host communication.

One or more internal logical communication paths may exist between the DA's, the RA's, the HA's, and the memory 26. An embodiment, for example, may use one or more internal busses and/or communication modules. For example, the global memory portion 25 b may be used to facilitate data transfers and other communications between the DA's, HA's and RA's in a data storage system. In one embodiment, the DAs 23 a-23 n may perform data operations using a cache that may be included in the global memory 25 b, for example, in communications with other disk adapters or directors, and other components of the system 20 a. The other portion 25 a is that portion of memory that may be used in connection with other designations that may vary in accordance with each embodiment.

The particular data storage system as described in this embodiment, or a particular device thereof, such as a disk, should not be construed as a limitation. Other types of commercially available data storage systems, as well as processors and hardware controlling access to these particular devices, may also be included in an embodiment.

Also shown in the storage system 20 a is an RA or remote adapter 40. The RA may be hardware including a processor used to facilitate communication between data storage systems, such as between two of the same or different types of data storage systems.

Host systems provide data and access control information through channels to the storage systems, and the storage systems may also provide data to the host systems through the channels. The host systems do not address the disk drives of the storage systems directly, but rather access to data may be provided to one or more host systems from what the host systems view as a plurality of logical devices or logical volumes (LVs). The LVs may or may not correspond to the actual disk drives. For example, one or more LVs may reside on a single physical disk drive. Data in a single storage system may be accessed by multiple hosts allowing the hosts to share the data residing therein. The HAs may be used in connection with communications between a data storage system and a host system. The RAs may be used in facilitating communications between two data storage systems. The DAs may be used in connection with facilitating communications to the associated disk drive(s) and LV(s) residing thereon.

The DA performs I/O operations on a disk drive. In the following description, data residing on an LV may be accessed by the DA following a data request in connection with I/O operations that other directors originate.

Referring now to FIG. 2B, shown is a representation of the logical internal communications between the directors and memory included in a data storage system. Included in FIG. 2B is a plurality of directors 37 a-37 n coupled to the memory 26. Each of the directors 37 a-37 n represents one of the HA's, RA's, or DA's that may be included in a data storage system. In an embodiment disclosed herein, there may be up to sixteen directors coupled to the memory 26. Other embodiments may use a higher or lower maximum number of directors that may vary.

The representation of FIG. 2B also includes an optional communication module (CM) 38 that provides an alternative communication path between the directors 37 a-37 n. Each of the directors 37 a-37 n may be coupled to the CM 38 so that any one of the directors 37 a-37 n may send a message and/or data to any other one of the directors 37 a-37 n without needing to go through the memory 26. The CM 38 may be implemented using conventional MUX/router technology where a sending one of the directors 37 a-37 n provides an appropriate address to cause a message and/or data to be received by an intended receiving one of the directors 37 a-37 n. In addition, a sending one of the directors 37 a-37 n may be able to broadcast a message to all of the other directors 37 a-37 n at the same time.

Referring now to FIG. 3, shown is an example of components of the system 10 that may be used in connection with performing techniques described herein. The example 100 includes a host 14 a and a data storage system 12 as previously described in connection with other figures. The host 14 a includes a database application 102 executing thereon. The database application 102 may read and write data, for example, from a data storage device 104 which may be local to the host 14 a. In connection with performing some operations such as, for example, a backup operation, data from a device local to a host, such as device 104, may be copied to another central location, such as a device 106 on data storage system 12. Device 106 of data storage system 12 may then be backed up onto other data storage devices. The database application 102 may perform file-based commands, such as file operations to read, write, truncate, and the like, in connection with operating on files included in the device 104. The backup operation may also be performed on data at the file system level where the specified elements to be backed up are, for example, particular files, directories, and the like, in accordance with the file system and structure of a particular embodiment. The data which is the subject of the backup operation may be a portion of data included in a device 106.

As part of obtaining and maintaining a copy of source data from a source device, such as 104, an initial synchronization operation may be performed. The synchronization operation may initially copy files of the source data to a target, such as target data included in device 106. After synchronization, both the source and target data copies are the same. Subsequent updates or modifications to the source data can then be asynchronously replicated on the target data by mirroring file operations performed on the source data. In other words, if a particular file operation results in a modification or change to the source data, this file operation may be recorded and then also applied to the target data. When performing a backup operation upon the target data, the target data may be characterized as a consistent copy of the source data at a point in time. As described herein, a copy of the source data is in a consistent state at a point in time, for example, when all outstanding database transactions are committed, any new incoming transactions are placed on hold or “queued”, and any database buffers or other cached data are flushed with respect to the selected point in time.

What will be described herein are techniques that may be used in connection with obtaining a consistent copy of the source data from device 104 on a target such as device 106 of the data storage system 12. The target data, at a point in time when it is a consistent copy of the source data, may then be used in connection with performing other operations and tasks such as, for example, a backup operation.

Although the example 100 illustrates only a single host 14 a, it should be noted that any one of the other hosts included in the system 10 may also be used in connection with the techniques described herein. The host 14 a, the particular application such as the database application 102, and the use of the target data included on storage device 106 in connection with performing a backup operation are selected for the purposes of illustrating the techniques described herein. Other applications besides database applications may be used in connection with performing operations on a source data device being replicated to a target data storage device. Additionally, the target data storage device may be used in connection with any one of a variety of different operations besides a backup operation.

The components included in FIG. 3 may only represent a portion of those actually included in an embodiment. The particular ones of FIG. 3 have been included for the purposes of illustration and example. Various configurations are described elsewhere herein. For example, one embodiment may use host-to-host communications, data storage system-to-host, or data storage system-to-data storage system communications in connection with implementing the techniques described herein.

In following paragraphs, what will first be described are techniques that may be used in replicating the source data on a target. Subsequently what will be described are additional elements to the replicating technique to facilitate determination of when the target data may be characterized as a consistent copy of the source data.

What will now be described are techniques that may be used in connection with replicating source data, such as data from source device 104, to a target device, such as device 106. The replication techniques described herein may be performed at the file system level in accordance with file system level commands. In accordance with one technique that will be described in connection with FIG. 4, various file system level commands may be captured and stored as file operations are applied, in real time, to the source device 104. As these file operations are captured and recorded in a queue or storage location on a source system, such as the host 14 a, the captured operations may be forwarded to a target system or server for application to target data included in a target device 106.

It should be noted that although a particular replication technique using asynchronous replication is described in more detail herein, the techniques described herein may be used in connection with other replication techniques known to those of ordinary skill in the art such as, for example, various modes of synchronous replication.

Referring now to FIG. 4, shown is an example 200 of components that may be used in connection with real-time replication of file operations in order to maintain a mirror copy of source data included in a source device 104 on a target device 106. Included in the example 200 are a source system 230 and a target system 240. The source system 230 may be, for example, the host 14 a as previously illustrated in connection with FIG. 3. The target system 240 may be, for example, a server included in, or connected to, the data storage system 12. The application 102 may perform file operations such as, for example, file write operations or other operations causing modification to the source data. During execution of the application 102, these file operations issued by the application 102 (step 1) may be captured using one or more components. Some of the components execute in kernel space 202. Components executing in kernel space 202 may include the I/O manager 204, a mirroring driver 206, and the file system 208. It should be noted that an embodiment may include components other than those illustrated in connection with this example. For example, an embodiment may include one or more other drivers below the file system 208 in connection with interfacing with the disk or other device 104. The mirroring driver 206 may be characterized as a filter driver which captures file operations of interest on the way back up the call chain (step 2) and records the file operations of interest in the kernel cache 210 (step 3). In the event that the kernel cache 210 is full or otherwise overflows, the mirroring driver 206 may record the captured data operation in an overflow location 212 (step 4). A consumer or reader of the captured data operations included in the kernel cache 210 and/or overflow location 212 is the replication service 214, a user-mode process. The replication service 214 reads the captured data operations from the kernel cache 210 (step 5) and the overflow location 212 (step 6). It should be noted that the mirroring driver 206 continues to write to the overflow location 212 until the replication service processes a sufficient amount of captured file operations from the kernel cache. When this occurs, the mirroring driver may then resume writing to the kernel cache 210. When the mirroring driver queues an operation in the kernel cache 210 or overflow location 212, the replication service 214 then reads the file operation from either the kernel cache or the overflow location 212. The replication service 214 transmits the file operation (step 7) as read from 210 and/or 212 to the appropriate target system such as server 240. Code located on the target server may include, for example, replication service 216 which processes the received file operations (step 8) as transmitted from service 214 of the source system 230. The replication service 216 on the target may then proceed to apply the received file operations (step 9) to the files of the target data included in the target storage device 106 and send an acknowledgment back to the source system (step 10). It should be noted that in the event that the replication service 216 is blocked from performing or applying a file operation, the file operation may be recorded in a blocked data container 220 (step 11). Applying a file operation to a file may be blocked, for example, if a disk is full, a file is in use, and the like. Once a file is labeled as having operations blocked, all subsequent file operations to that particular file may then be saved in the blocked data container 220 until the file is otherwise associated with a state of unblocked file operations. Periodically, the replication service 216 may attempt to reapply the file operations as included in the blocked data container 220 (step 12).
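The target-side handling of blocked file operations (steps 9, 11 and 12) may be illustrated with a minimal Python sketch. The class name `TargetReplicator`, the `apply_fn` callback, and the in-memory containers are hypothetical names chosen for illustration only and are not part of any described product; the acknowledgment of step 10 is omitted.

```python
from collections import defaultdict, deque


class TargetReplicator:
    """Hypothetical stand-in for a target-side replication service."""

    def __init__(self, apply_fn):
        # apply_fn(path, payload) -> bool is an assumed callback that returns
        # False when the operation cannot be applied (disk full, file in use).
        self.apply_fn = apply_fn
        self.blocked = defaultdict(deque)  # blocked data container, per file
        self.applied = []                  # log of applied operations

    def receive(self, path, payload):
        # Once a file has blocked operations, later operations on that file
        # are held as well so that per-file ordering is preserved.
        if path in self.blocked or not self.apply_fn(path, payload):
            self.blocked[path].append(payload)
            return False
        self.applied.append((path, payload))
        return True

    def retry_blocked(self):
        # Periodic attempt to reapply held operations, oldest first.
        for path in list(self.blocked):
            queue = self.blocked[path]
            while queue and self.apply_fn(path, queue[0]):
                self.applied.append((path, queue.popleft()))
            if not queue:
                del self.blocked[path]


in_use = {"a.db"}
replicator = TargetReplicator(lambda path, payload: path not in in_use)
replicator.receive("a.db", "w1")   # blocked: file in use
replicator.receive("b.db", "w2")   # applied immediately
replicator.receive("a.db", "w3")   # held behind w1
in_use.clear()
replicator.retry_blocked()
```

In this sketch an operation on a blocked file is never applied ahead of an earlier held operation on the same file, which is one way to keep the target copy of that file in order.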

As just described, mirroring may be characterized as the real time mirroring of a file operation applied to the source data. The mirroring may be facilitated, for example, using the mirroring driver 206 and the service 214. Mirroring may be triggered by a file operation of an application such as a database application 102. The application may also be, for example, any other user-mode process such as Microsoft Word. The file operation may also be caused by a remote client computer attached to the source, for example, through a mapped drive. The operations just described in connection with the example 200 of FIG. 4 may be characterized as having three distinct phases. A first phase may be characterized as mirroring in which the file operations are captured in the kernel and queued to be sent to a target. A second phase may be characterized as forwarding in which the file operations are forwarded from the source to the target. A third phase may be characterized as updating in which the file operations are applied to the target data. It should be noted that the source system 230 and target system 240 may be located within the same or different computer systems. In other words, 230 may be included in a first computer system and 240 may be included in a second different computer system. Alternatively, elements 230 and 240 may be included in the same computer system. Elements 230 and 240 may be physically located in close proximity or at geographically different locations. These and other variations of the example 200 are readily appreciated by one of ordinary skill in the art.

The mirroring driver 206 may capture and record file operations when a file operation has been completed as may be indicated, for example, by the return status of each request as indicated by the arrows going up the return call chain of the elements included in 202. Upon the completion of the file operation, the mirroring driver 206 makes a determination as to whether it should capture the particular file operation. In this example, the file operation may be one of interest if the operation results in a modification to a file such as, for example, in connection with a file creation, a file write operation, a file truncation operation, and the like. It should be noted that other conditions may be associated with defining whether a particular file operation is of interest including, for example, the successful status of a file operation. If the mirroring driver 206 determines that the particular file operation is of interest in accordance with the one or more criteria as may be specified in an embodiment, the data operation may be captured and stored in the kernel cache 210. It should be noted that in this example, the kernel cache may be characterized as an area of shared memory acting as a queue for all mirrored file operations. The kernel cache may be a fixed size. If the kernel cache is full, the file operation may then be recorded in the overflow area 212. In this example, the replication service 214 on the source and the replication service 216 on the target may be characterized as user mode applications and are not required to be executed in a privileged mode, such as kernel mode 202.
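The capture decision and the fixed-size cache with an overflow area may be sketched as follows. This is a user-space Python illustration under stated assumptions, not driver code: the operation names in `MODIFYING_KINDS`, the `FileOp` record, and the `CaptureQueue` class are hypothetical, and the rule used here for resuming writes to the cache (only after the overflow area has been drained) is one simple assumption that preserves ordering.

```python
from collections import deque
from dataclasses import dataclass

# Hypothetical names for operations that modify a file.
MODIFYING_KINDS = {"create", "write", "truncate"}


@dataclass(frozen=True)
class FileOp:
    seq: int
    kind: str
    path: str
    succeeded: bool = True


def is_of_interest(op):
    # Capture only completed, successful operations that modify a file.
    return op.succeeded and op.kind in MODIFYING_KINDS


class CaptureQueue:
    """Fixed-size cache backed by an unbounded overflow area."""

    def __init__(self, cache_capacity):
        self.cache_capacity = cache_capacity
        self.cache = deque()
        self.overflow = deque()

    def capture(self, op):
        if not is_of_interest(op):
            return False
        # While the overflow area holds records, keep writing there so that
        # everything in the cache is older than everything in the overflow.
        if self.overflow or len(self.cache) >= self.cache_capacity:
            self.overflow.append(op)
        else:
            self.cache.append(op)
        return True

    def read_next(self):
        # The reader drains the cache first, then the overflow area.
        if self.cache:
            return self.cache.popleft()
        if self.overflow:
            return self.overflow.popleft()
        return None


queue = CaptureQueue(cache_capacity=2)
for op in [FileOp(1, "write", "a"), FileOp(2, "read", "a"),
           FileOp(3, "create", "b"), FileOp(4, "truncate", "a")]:
    queue.capture(op)
```

With a capacity of two, the read operation is skipped, operations 1 and 3 land in the cache, and operation 4 goes to the overflow area; reading returns them in capture order.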

In connection with recording the file operations of interest, the mirroring driver 206 produces what may be characterized as a stream of file operations to be applied to the target data included in target device 106. The foregoing describes asynchronous data movement in which the file system changes do not need to be committed to the target before additional data operations to the source data are allowed to continue. Rather, the source system is allowed to operate normally in parallel to the data replication processing. The asynchronous replication used in the foregoing allows for capturing any changes to source data immediately which are then cached locally on a source system. The changes are then forwarded to the target system or server as network and other resources allow.

What will now be described are techniques that may be used in connection with inserting additional consistency point markers in the foregoing stream of recorded file operations. The consistency point markers indicate various points at which a copy of the source data, as may be replicated in the target data, is consistent. In other words, a consistency point marker may be inserted into the stream of file operations to serve as a marking point such that if data operations up to that particular marker are applied to the target device, the target device may be characterized as a consistent point in time copy of the source data. In one embodiment as will be described herein with reference to the components of FIG. 4, a modified version of the replication service 214 may also be used in connection with inserting a consistency point marker into the stream or queue of the file operations to be applied to the target copy of the data.
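The role of a marker in the stream may be illustrated with a short Python sketch. The record types `FileOp` and `ConsistencyMarker`, the dictionary standing in for the target data, and the function `apply_through_marker` are assumptions made for illustration; the actual marker record contents are described elsewhere herein.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileOp:
    path: str
    data: str


@dataclass(frozen=True)
class ConsistencyMarker:
    marker_id: int


def apply_through_marker(stream, target, marker_id):
    """Apply operations preceding the given marker; return the remainder."""
    positions = [i for i, record in enumerate(stream)
                 if isinstance(record, ConsistencyMarker)
                 and record.marker_id == marker_id]
    if not positions:
        raise ValueError("marker not present in stream")
    cut = positions[0]
    for record in stream[:cut]:
        # Earlier markers in the prefix carry no data and are skipped.
        if isinstance(record, FileOp):
            target[record.path] = record.data
    return stream[cut + 1:]


stream = [FileOp("a", "1"), FileOp("b", "1"), ConsistencyMarker(1),
          FileOp("a", "2"), ConsistencyMarker(2), FileOp("b", "2")]
target = {}
remaining = apply_through_marker(stream, target, marker_id=1)
```

After the call, `target` reflects exactly the operations that preceded marker 1, and the operations after the marker remain queued for later application.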

It should be noted that RepliStor is a commercially available product by EMC Corporation for use in connection with data replication and mirroring as described herein. The commercially available RepliStor product includes the replication services 214 and 216 and the mirroring driver which operates as described, for example, in connection with the embodiment 200 of FIG. 4. What will now be described is one embodiment in which modified versions of those components as illustrated in FIG. 4 may be used in connection with obtaining a consistent point in time copy of source data on a target. The copy of the source data as included in a target may be used in connection with performing other operations as will be appreciated by those of ordinary skill in the art.

Referring now to FIG. 5, shown is an example 300 of components that may be included in the framework used in connection with inserting consistency point markers into the stream of file operations captured on a source system. Included in the example 300 are Writers 302, Providers 304, and Requesters 306. Additionally included in the example 300 is a component VSS 310. In this example, VSS 310 is the Volume Shadow Copy Service (VSS) by Microsoft Corporation. VSS is a commercially available product that may be characterized as providing a framework with an application programming interface (API) that may be used in connection with creating consistent point in time copies of data. As more generally used, VSS allows system administrators to create snapshots or shadow copies of a volume of files that may be shared as a network resource. VSS communicates with the different components that may be characterized in different classes as Writers 302, Providers 304, and Requesters 306 with respect to a particular dataset so that a point-in-time copy of the data may be made. Writers 302 may be characterized as a first set of components which write to a copy of data, such as the source data included in the source system 230 of FIG. 4. A writer may be, for example, the database application 102 illustrated in connection with FIGS. 3 and 4, a Microsoft Exchange server, or other application. Generally, Writers 302 may perform updates or other types of operations causing modifications to a dataset. Providers 304 may be characterized as software and/or hardware components which provide the dataset such as, for example, the underlying storage system. Providers of the datasets may include, for example, data storage systems such as the Symmetrix data storage system by EMC Corporation or other data storage systems. The Providers 304 generally provide the resource or dataset which may be accessed by the Writers 302. Requesters 306 may be characterized as those components making a request for obtaining a point-in-time copy. In general use, Requesters may include, for example, a backup software application in connection with making a request for performing a backup copy of data as may be provided by one of the Providers 304. In operation, the Requester 306 may issue a request to the VSS component 310 to obtain a point-in-time copy of a dataset. VSS 310 then communicates with the Writers 302 of that dataset to pause any new transactions, finish any current transactions, and flush any cached data to disk. Once the Writers 302 have completed this set of operations with respect to a requested dataset, the VSS component 310 communicates with the appropriate Provider 304 to initiate a shadow copy process for the requested datasets. Once a shadow copy has been created, the backup software (e.g., requester in this instance) can then copy data from this shadow copy, for example, to a tape without involving the writers of the particular dataset. Thus, VSS acts as a framework for facilitating communications between Writers 302, Providers 304, and Requesters 306 in order to obtain a point-in-time copy of data.
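The interaction among the three classes of components may be sketched in Python as follows. This is a conceptual model only and does not use or represent the actual VSS API; the classes `Writer`, `Provider`, and `Coordinator` and all of their methods are hypothetical names. Resuming the writers after the copy is created is included here as an assumption about normal operation.

```python
class Writer:
    """Hypothetical application that modifies a dataset."""

    def __init__(self, name):
        self.name = name
        self.paused = False
        self.cached = []   # data not yet flushed
        self.disk = []     # data flushed to the dataset

    def write(self, item):
        if self.paused:
            raise RuntimeError("new transactions are paused")
        self.cached.append(item)

    def freeze(self):
        # Pause new transactions and flush cached data to disk.
        self.paused = True
        self.disk.extend(self.cached)
        self.cached.clear()

    def thaw(self):
        self.paused = False


class Provider:
    """Hypothetical component that provides the dataset and its copies."""

    def __init__(self, writers):
        self.writers = writers

    def create_shadow_copy(self):
        return {w.name: list(w.disk) for w in self.writers}


class Coordinator:
    """Plays the coordinating role between requester, writers, provider."""

    def __init__(self, writers, provider):
        self.writers = writers
        self.provider = provider

    def request_copy(self):
        for w in self.writers:
            w.freeze()
        try:
            return self.provider.create_shadow_copy()
        finally:
            for w in self.writers:
                w.thaw()


db = Writer("db")
db.write("t1")
coordinator = Coordinator([db], Provider([db]))
shadow = coordinator.request_copy()   # the requester's call
db.write("t2")                        # does not affect the shadow copy
```

The requester can then read from `shadow` without involving the writer, which continues to accept new writes.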

The foregoing framework may be used in connection with obtaining a consistent point in time copy of the source data. In connection with the techniques described herein, a host-based user mode process, such as the replication service 214 of the example 200 of FIG. 4, may register as one of the Providers 304 of a dataset, such as the source data. Thus, using the framework of FIG. 5, the replication service 214 may register as a provider and be notified by VSS 310 when a requested dataset (e.g., the source data) is in a consistent state. A process may initiate a request as one of the Requesters 306 using an application programming interface (API) when generation of a consistency point marker is desired. It should be noted that the process or component acting as the Requester and the Provider for the consistency point marker generation may be the same process or component as well as different processes or components. The API may be specified in accordance with the particular embodiment such as the VSS and other components.

As an example, a scheduler may be a process which executes in the source system and makes a request as a Requester 306 for the source data. The replication service 214 may be registered as a Provider 304 for the source data. VSS 310 communicates with the database application and any other Writers 302 of the source data to complete existing transactions, not start any new ones, and flush any cached source data so that the source data is in a consistent state. VSS 310 then notifies the registered Providers 304 of the source data which causes notification to be sent to the service 214 that the source data is currently in a consistent state. Any further operations to the source data are paused for a predetermined small time period. The service 214 may then generate a consistency point marker as a next element in the stream of captured file operations for the source data. The various Writers 302 of the source data may then resume normal operations in connection with the source data.
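The ordering of the foregoing example may be illustrated with a minimal sketch. The sketch does not use the actual VSS API. The names Writer, Provider, and request_consistency_point are hypothetical stand-ins for the database application, the replication service 214, and the coordinating component, and the stream of captured file operations is modeled as a simple list of strings.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Writer:
    """Hypothetical stand-in for an application that modifies the source data."""
    name: str
    paused: bool = False
    cached: List[str] = field(default_factory=list)

    def quiesce(self, stream: List[str]) -> None:
        # Hold new transactions and flush cached updates into the captured stream.
        self.paused = True
        stream.extend(self.cached)
        self.cached.clear()

    def resume(self) -> None:
        self.paused = False


class Provider:
    """Hypothetical stand-in for the replication service registered as a provider."""

    def __init__(self, stream: List[str]) -> None:
        self.stream = stream

    def on_consistent(self) -> None:
        # The source data is consistent now, so the marker is the next element.
        self.stream.append("MARKER")


def request_consistency_point(writers: List[Writer], provider: Provider,
                              stream: List[str]) -> None:
    """Coordinator role (played by VSS in the described embodiment)."""
    for w in writers:
        w.quiesce(stream)          # writers finish, pause, and flush
    provider.on_consistent()       # provider is notified and writes the marker
    for w in writers:
        w.resume()                 # writers resume normal operations


stream = ["op1", "op2"]
db = Writer("database", cached=["op3"])
request_consistency_point([db], Provider(stream), stream)
print(stream)  # ['op1', 'op2', 'op3', 'MARKER']
```

In this sketch the marker is appended only after every writer has flushed, so all operations preceding the marker in the stream reflect a consistent state of the source data.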

Referring now to FIG. 6, shown is an example of an embodiment of a stream of captured file operations as may be stored, for example, in the kernel cache 210. Included in the example 400 is a list of records. Each record corresponds to either a particular file operation that has been captured, as with record 404, or to a consistency point marker, such as record 402. The data included in each captured file operation record 404 may include, for example, a particular file name, location, data value and the like, in accordance with the particular file operation. The consistency point marker record 402 may be a special record written into the stream of file operations. An example of an embodiment of the data elements that may be included in a consistency point marker record 402 is described in more detail in following paragraphs.

Referring now to FIG. 7, shown is an example of elements that may be included in a consistency point marker record 402. In the example 402 of FIG. 7, a consistency point marker record may include a pause forwarding flag 404, a pause update flag 406, an identifier for a source script 408, an identifier for a target script 410, and other data 412. In one embodiment, the fields 404 and 408 may be used in connection with processing consistency point markers on the source system, and fields 406 and 410 may be used in connection with processing consistency point markers on the target system.
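One possible in-memory layout of the record of FIG. 7 is sketched below. The class name, field names, types, and the script path are assumptions made for illustration only; the description specifies the fields 404 through 412 but not their representation.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional, Tuple


@dataclass
class ConsistencyPointMarker:
    """Hypothetical layout of a consistency point marker record 402."""
    pause_forwarding: bool = False          # flag 404, used on the source system
    pause_update: bool = False              # flag 406, used on the target system
    source_script: Optional[str] = None     # field 408, may be unspecified
    target_script: Optional[str] = None     # field 410, may be unspecified
    other: Dict[str, Any] = field(default_factory=dict)  # other data 412

    def source_fields(self) -> Tuple[bool, Optional[str]]:
        # Fields examined when the marker is processed on the source system.
        return self.pause_forwarding, self.source_script

    def target_fields(self) -> Tuple[bool, Optional[str]]:
        # Fields examined when the marker is processed on the target system.
        return self.pause_update, self.target_script


# Example: a marker for a backup taken on the target (the path is illustrative).
marker = ConsistencyPointMarker(pause_update=True,
                                target_script="/scripts/nightly_backup")
print(marker.source_fields())  # (False, None)
print(marker.target_fields())  # (True, '/scripts/nightly_backup')
```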

The service 214 on the source system may examine and perform processing in accordance with the particular records in the kernel cache. When a record corresponds to a consistency point marker, special processing may be performed using the values of fields 404 and 408. Field 408 may specify a script that is executed on the source system when the consistency point marker record 402 is detected. This script may include processing steps which examine and perform processing in accordance with the value of the pause forwarding flag 404. The pause forwarding flag 404 may be a binary value used in connection with processing the stream of captured file operations on the source system. In one embodiment, if the pause forwarding flag 404 is set (e.g., =1), the flag 404 indicates that records corresponding to the captured file operations should not be sent from the source to the target system causing, for example, a backlog of the file operations to be stored on the source system rather than on a target system. If the pause forwarding flag 404 is off (e.g., =0), it indicates that the records corresponding to the captured file operations should be forwarded to the target system from the source system such that any backlog or build-up of data would occur on the target system rather than the source system. The pause forwarding flag 404 may be used in processing steps by a source script as may be indicated by field 408. It should be noted that the source script field 408 is optional in that a value may be unspecified for an instance of the record 402 in which case no script is executed and the pause forwarding flag value may be ignored. In this case, file operations may be forwarded to the target system in effect as if the flag 404 has a value of 0. In one embodiment, the flag 404 may cause the records to queue up on the source system for a predetermined time period. After this predetermined time period, the replication service 214 may then resume processing and forwarding records to the target system.
In another embodiment, the script may include processing steps causing forwarding of the records to the target system to cease until the occurrence of a particular event. When this particular event has occurred, the service 214 may be signaled to resume processing of the records.

Processing may also be performed on the target system by the service 216 for the received records of 400 forwarded from the source system. Upon detection of a record corresponding to a consistency point marker, special processing may be performed which may use the values of fields 406 and 410. The pause update flag 406 may be a binary value, having the value of 1 when the application of captured file operations should be paused or not applied to a target copy of the data on the target system. Otherwise, the pause update flag, having a value of 0, indicates that the captured file operations as indicated by the transmitted records of 400 should be applied to the target copy of the data. Flag 406 may be used in connection with processing performed by a target system script as may be indicated by field 410. Field 410 may optionally identify a script to be executed on the target system. Field 410 of a record for a particular consistency point marker may also be unspecified in which case the value of flag 406 may be ignored in connection with target system processing. The fields 408 and 410 may include identifiers such as, for example, a directory location and file name containing a script to be executed on the respective source or target system.

It should be noted that an embodiment may include additional fields that may be used in connection with processing the records on the source and/or target systems.

In one embodiment, the replication service 214 of the source system may create each instance of the record 402 and accordingly initialize the fields of the record 402. As an example, if a backup is to be performed every day using a copy of the target data, a scheduler process on the source system may initiate a request to generate a consistency point marker at each 24-hour interval. The particular values of the corresponding consistency point marker record may refer to the scripts for this operation. In addition to generation of markers for the backup operation consistent copy of the source data, additional requests for marker generation may be performed at predetermined time intervals or upon the occurrence of certain events in connection with other operations. As such, the service creating the records for the consistency point markers may accordingly initialize the fields, such as the script fields 408 and 410, with the particular scripts for the respective operations. Similarly, the service 214 generating the records 402 may also initialize other fields of each record instance in accordance with the particular operation to be performed.

Referring now to FIG. 8, shown is a flowchart 500 of processing steps that may be performed in an embodiment in connection with determining when the source data is in a consistent state. The steps of 500 summarize the processing described above, for example, in connection with FIG. 5. At step 502, a request is made to obtain a consistent copy of the source data. Such a request may be made, for example, in connection with performing periodic backup operations by a scheduler task, and the like. At step 504, the VSS component communicates with the writers of the source data to pause new transactions, finish current transactions, and flush all cached data to disk. The writers may pause at this state for a predetermined period of time. VSS then notifies the registered provider of the source data that the data is in a consistent state. In this example, the provider may be, for example, the replication service 214 of FIG. 4. At step 508, the registered provider writes a record corresponding to a consistency point marker into the queue of recorded operations to be processed for forwarding to the target system.

Referring now to FIG. 9, shown is a flowchart of processing steps that may be performed in an embodiment by a source system in connection with processing records from the outgoing queue of records to be forwarded to the target system. As described elsewhere herein with reference to elements of FIG. 4, these records may be stored in the kernel cache and/or overflow location. At step 552, the next record in the outgoing queue is read. At step 554, a determination is made as to whether this record corresponds to a consistency point marker. If not, control proceeds to step 556 to perform other processing. It should be noted that the other processing of step 556 may include steps for forwarding the record to the target system as described elsewhere herein. Additionally, an embodiment may also perform processing steps other than as described herein in accordance with the different file operations and other fields included in each record. Control then proceeds to step 552 to read the next record. If step 554 evaluates to yes, control proceeds to step 558 where the fields of the current record are read. A determination at step 560 is made as to whether a source script is specified. If not, control proceeds to step 552 to continue processing with the next record. If step 560 evaluates to yes indicating that a source script has been specified, the source script is obtained at step 561 and execution of the script is performed. Included in this example of the source script are statements using the pause forwarding flag and conditionally performing processing steps based on the flag's value. In other words, in this example, the source script includes statements for performing steps 562, 564 and 566. At step 562, a determination is made as to whether the pause forwarding flag is set. If so, control proceeds to step 564 where the source system pauses any further sending of records in the outgoing queue to the target system.
If step 562 evaluates to no, then no pause is made in connection with forwarding further records from the outgoing queue to the target system. Control proceeds to step 566 where other processing steps may be performed in accordance with the source script. In this example, the source script may specify that forwarding of records from the outgoing queue may be resumed if previously paused. It should also be noted that variations to the foregoing may be specified in a script. Control proceeds to step 552 with processing of the next record in the outgoing queue. The steps of flowchart 550 may be performed by the replication service 214 of the source system and a source system script.
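The source-side loop of FIG. 9 may be sketched as follows, under several assumptions that are not taken from the description: scripts are modeled as Python callables looked up by name, the marker itself is forwarded before its script runs, and records read while forwarding is paused are held in a backlog list until resume() is called. All class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Union


@dataclass
class FileOp:
    name: str


@dataclass
class Marker:
    pause_forwarding: bool = False       # flag 404
    source_script: Optional[str] = None  # field 408


Record = Union[FileOp, Marker]


class SourceService:
    """Hypothetical model of the replication service on the source system."""

    def __init__(self, scripts: Dict[str, Callable[["SourceService", Marker], None]]):
        self.scripts = scripts
        self.paused = False
        self.sent: List[Record] = []     # records forwarded to the target
        self.backlog: List[Record] = []  # records held on the source while paused

    def forward(self, record: Record) -> None:
        (self.backlog if self.paused else self.sent).append(record)

    def resume(self) -> None:
        # Signaled after a time period or event; drain the backlog in order.
        self.paused = False
        self.sent.extend(self.backlog)
        self.backlog.clear()

    def process(self, queue: List[Record]) -> None:
        for record in queue:                                # step 552
            self.forward(record)                            # assumption: markers are sent too
            if not isinstance(record, Marker):              # step 554
                continue                                    # step 556
            script = self.scripts.get(record.source_script or "")  # steps 558, 560
            if script is not None:
                script(self, record)                        # step 561


def hold_until_signaled(service: SourceService, marker: Marker) -> None:
    if marker.pause_forwarding:   # step 562
        service.paused = True     # step 564
    # step 566: other script-specific processing would go here


svc = SourceService({"hold": hold_until_signaled})
svc.process([FileOp("a"), Marker(True, "hold"), FileOp("b"), FileOp("c")])
print(len(svc.sent), len(svc.backlog))  # 2 2
svc.resume()
print(len(svc.sent), len(svc.backlog))  # 4 0
```

In this sketch a marker with no source script leaves forwarding untouched, which matches the case in which the pause forwarding flag is ignored.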

Referring now to FIG. 10, shown is a flowchart of processing steps that may be performed in an embodiment by a target system in connection with processing records received from the source system. At step 602, the next record in the incoming queue is read. At step 604, a determination is made as to whether this record corresponds to a consistency point marker. If not, control proceeds to step 610 to determine if this is a file operation to be applied to a file which is already in a blocked state. If not, control proceeds to step 606 to apply the operation to the target data. After step 606, control proceeds to step 602 to process the next record in the incoming queue. Otherwise, if the file operation is to be applied to a file which is in a blocked state, control proceeds to step 612 to save the record in a saved block file, such as element 220 of FIG. 4. As described elsewhere herein, at some later point when the particular file is no longer in a blocked state, the blocked file operations may be applied to the target data. After step 612, control proceeds to step 602 to process the next record in the incoming queue. If step 604 determines that the current record corresponds to a consistency point marker, control proceeds to step 608 to read the fields of the current record. At step 614, a determination is made as to whether a target script is specified. If not, control proceeds to step 602. Otherwise, control proceeds to step 616 to read the script and begin execution of the script. Included in this example of the target script are statements using the pause applying updates flag and conditionally performing processing steps based on the flag's value. In other words, in this example, the target script includes statements for performing steps 618, 620 and 624. At step 618, a determination is made as to whether the pause applying updates flag is set. If not, control proceeds to step 624.
Otherwise, control proceeds to step 620 to pause processing any further records in the incoming queue and thus pause applying any further file operations as may be indicated by the records. At step 624, additional processing may be performed in accordance with the script and the script may cause processing of records in the incoming queue to resume if previously paused at step 620. Processing at step 624 in accordance with the script may include, for example, performing a backup of the target data or performing a split of the target data while the target system is currently pausing the application of any further updates to the target data. Subsequently, after the backup or other operation is performed, the target system may resume processing of records in the incoming queue. It should also be noted that variations to the foregoing may be specified in a script. Control proceeds to step 602 to process the next record. The steps of flowchart 600 may be performed by the replication service 216 of the target system and a target system script.
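The target-side loop of FIG. 10 may be sketched in the same hedged manner. The target data is modeled as a dictionary from file path to contents, the saved block file as a list, and the target script as a Python callable that takes a backup while updates are paused. These representations, and all names used, are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Optional, Union


@dataclass
class FileOp:
    path: str
    data: str


@dataclass
class Marker:
    pause_update: bool = False           # flag 406
    target_script: Optional[str] = None  # field 410


class TargetService:
    """Hypothetical model of the replication service on the target system."""

    def __init__(self, scripts: Dict[str, Callable[["TargetService", Marker], None]],
                 blocked: Iterable[str] = ()):
        self.scripts = scripts
        self.blocked = set(blocked)
        self.target: Dict[str, str] = {}          # target copy of the data
        self.saved: List[FileOp] = []             # saved block file
        self.backups: List[Dict[str, str]] = []   # results of backup scripts
        self.paused = False

    def process(self, incoming: List[Union[FileOp, Marker]]) -> None:
        for record in incoming:                          # step 602
            if isinstance(record, FileOp):               # step 604: not a marker
                if record.path in self.blocked:          # step 610
                    self.saved.append(record)            # step 612
                else:
                    self.target[record.path] = record.data  # step 606
                continue
            script = self.scripts.get(record.target_script or "")  # steps 608, 614
            if script is not None:
                script(self, record)                     # step 616


def backup_script(service: TargetService, marker: Marker) -> None:
    if marker.pause_update:                        # step 618
        service.paused = True                      # step 620
    service.backups.append(dict(service.target))   # step 624: back up the consistent copy
    service.paused = False                         # step 624: resume applying updates


svc = TargetService({"backup": backup_script}, blocked={"locked.db"})
svc.process([FileOp("a.db", "v1"), FileOp("locked.db", "x"),
             Marker(True, "backup"), FileOp("a.db", "v2")])
print(svc.backups)  # [{'a.db': 'v1'}]
print(svc.target)   # {'a.db': 'v2'}
```

Because the loop is synchronous, no further record is applied until the script returns, so the backup reflects exactly the operations that preceded the marker.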

The foregoing describes an embodiment using VSS. However, an embodiment may use other techniques in connection with coordinating the communication between the components accessing a particular data set without using VSS. For example, an embodiment may use scripts and/or other programming techniques as an alternative to VSS.

It should be noted that the components of FIG. 4, as modified to include the functionality for obtaining and utilizing the consistency point markers, may be included in any one of a variety of different configurations. For example, a first configuration may include the components of the source 230 on a host. The replication service 214 of the source may reside and execute in a host. The data source, device 104, may be local to the host or otherwise connected to the host. One configuration of the target 240 may include a target host upon which the replication service 216 resides and executes. The target device 106 may be local to the target host or otherwise connected to the target host. The foregoing may be characterized as a host-to-host based communication system and the replication services, mirroring drivers and other components used in connection with the techniques described herein reside on host systems. Some of the components used in connection with the techniques described herein may also reside and be executed within data storage systems. For example, an embodiment of the target system 240 may include a target data storage system, such as a Symmetrix or Clarion data storage system, upon which the replication service 216 resides and executes. The source system 230 may include a host which communicates with the target data storage system using appropriate communication connections. In another embodiment, the source system 230 may include a source host, upon which an application 102 executes and upon which the mirroring driver 206 executes. The source system 230 may also include a source data storage system upon which the replication service 214 executes and communicates with the source host to obtain the recorded file operations and markers from 210 and 212. The source data storage system of 230 may communicate with the target 240 (using host-based or data storage-based communications as appropriate for the target 240) in connection with the techniques described herein.
What will now be illustrated are some examples of the different configurations that may be used in connection with the foregoing techniques.

Referring now to FIG. 11, shown is an example 1000 illustrating a configuration in which the components and techniques described herein may be used. The example 1000 includes source data storage system 1012 connected to two source hosts, 1014 a-b. The two hosts 1014 a-b may communicate with target hosts 1014 c-d using communication medium 1020. The two source hosts 1014 a-b may communicate with each other and the source data storage system 1012 over communication medium 1018 a. Similarly, the target hosts 1014 c-d may communicate with each other and target data storage system 1012 a over communication medium 1018 b. The foregoing may be characterized as a host-to-host based communication using the techniques described herein where the components of the source 230 and target 240 used to perform the processing steps described herein reside and execute on respective source and target host systems. The foregoing example 1000 also illustrates the distributed nature of the data storage systems as may also exist in an embodiment. The source and target data storage systems may be, for example, as described in connection with FIG. 2.

Referring now to FIG. 12, shown is an example 1100 illustrating another configuration in which the components and techniques described herein may be used. In the example 1100, the source host 1102 communicates with a target data storage system 1012 a. The target data storage system 1012 a may include one or more data storage systems, such as Symmetrix or Clarion data storage systems, with which the source host 1102 communicates over 1020. The source host 1102 may obtain the source data from the source data storage system 1012. The source data storage system 1012 may be as described, for example, in connection with FIG. 2.

Referring now to FIG. 13, shown is an example 1200 illustrating another configuration of components that may be included in the source and/or target data storage systems. In particular, the example 1200 may be a configuration for a target data storage system of FIG. 12. The example 1200 includes a component 1050 that may be characterized as a front end processor or computer used to facilitate communications to/from one or more data storage systems 20 a-20 n through a common point. The component 1050 may be, for example, a server system as may be included in a SAN (storage area network), or other processor used to facilitate communications with components 20 a-20 n. With reference to the components of FIG. 4 as may be modified in accordance with the techniques described herein using consistency markers, in an embodiment in which the example 1200 is a target system, the replication service 216 may reside and be executed on component 1050.

FIGS. 11-13 are some examples selected to illustrate the different configurations that may be included in an embodiment using the techniques described herein. Other variations will be appreciated by those of ordinary skill in the art in connection with host-to-host, data storage-to-host and host-to-data storage based communications using the techniques described herein.

The foregoing describes a technique for passing consistency point markers from a source system to a target system where data is being automatically replicated from source data of the source system to target data of the target system. The source data may be a file system-based element such as a file, directory, and the like. One embodiment as described above may use Microsoft's VSS (Volume Shadow Copy Service) framework to obtain consistent “snapshots” of the source data at defined time intervals or consistency points, and to write out associated consistency point markers in the queue of captured file operations to be applied to the copy in the target. The different components may communicate using a defined application programming interface (API). The queue of captured file operations and consistency point markers may be continually sent to the target system where the captured file operations are read and applied to the target data copy until a consistency point marker is reached. At this point, the target system knows that the target data is in a consistent state with respect to a snapshot of the source data. Before applying any more captured file operations from the queue to the target data, the target system may use the consistent copy of the source data as reflected in the target data in connection with performing any one of a variety of tasks, such as a backup operation, a consistent split operation, and the like. Also described herein is a feature that may be used in controlling the stream of captured file operations in the queue as transmitted between the source and target systems. A script may be executed on the source system to control the flow rate at which the file operations and consistency point markers included in an outgoing queue of the source system are sent to the target system (e.g., pause forwarding so as not to send any further updates/writes to the target system).
A script may also be executed on the target system to pause applying any further captured file operations to the target data. As such, the target data representing a point in time consistent copy of the source data may be frozen in this consistent state for a length of time so that other operations (e.g., a backup operation, split operation, etc.) can be performed. Rather than use the VSS framework, an alternative embodiment may use scripts or other programming techniques to control the coordination between all registered data writers (e.g., a database application), providers (e.g., data storage systems such as Symmetrix data storage systems), and requesters in order to obtain a consistent snapshot of the data source, and insert consistency point markers in the queue of records of file operations to be applied to the target data as part of data replication. Different components may reside and execute upon hosts and/or data storage systems as also described herein in accordance with different configurations and others that will be appreciated by those of ordinary skill in the art.

While the invention has been disclosed in connection with preferred embodiments shown and described in detail, various modifications and improvements thereon will become readily apparent to those skilled in the art. Accordingly, the spirit and scope of the present invention should be limited only by the following claims.

1. A method for obtaining a copy of source data in a consistent state comprising: recording one or more file operations having a corresponding time sequence which modify said source data; receiving a request for a copy of the source data in a consistent state; determining at which point in said corresponding time sequence said source data is in a consistent state as a result of applying a portion of said file operations; marking said point in said corresponding time sequence at which said source data is in a consistent state; and applying the portion of file operations determined to place said source data in a consistent state to said copy of said source data.
2. The method of claim 1, wherein said marking said point in said corresponding time sequence includes: determining a corresponding one of said file operations at which said consistent state is obtained.
3. The method of claim 1, further comprising: recording said portion of file operations in real time as applied to the source data wherein said portion of file operations are recorded as a time-ordered series of records, each of said records corresponding to one of said file operations in said portion; inserting a marker record into said series of records indicating when said source data is in a consistent state; and applying file operations corresponding to said series of records to said copy of the source data until a marker record is determined.
4. The method of claim 3, further comprising: pausing further application of recorded file operations to said copy of the source data when performing an operation using said copy of said source data while allowing new incoming file operations to be applied to said source data.
5. The method of claim 4, wherein said pausing further application of recorded file operations to said copy of the source data is controlled by executing a target script on a target system including said copy of said source data.
6. The method of claim 5, further comprising: transmitting records corresponding to said portion of file operations to said target system wherein said transmitting of said records is paused in accordance with processing on a source system including said source data.
7. The method of claim 6, further comprising: executing a source script on said source system to control transmission of said records from said source system to said target system.
8. The method of claim 7, wherein said marker record includes an indicator corresponding to at least one of: said source script, said target script, a first flag value indicating whether to pause application of said portion of data operations to said copy of said source data, a second flag value indicating whether to pause transmission of said records from said source system to said target system.
9. The method of claim 4, wherein said operation performed using said copy of said source data is a backup operation of said source data.
10. The method of claim 1, wherein said request for a copy of the source data in a consistent state is made using an application programming interface to a component which coordinates access to said source data, and, in response to receiving said request, said component: causes a writer of said source data to complete existing transactions without starting any new transactions and to flush data cache stores of said source data, said writer pausing commencing of new transactions in accordance with at least one criteria; communicates with a registered element that said source data is in a consistent state; and communicates with said writer to resume new transactions in accordance with said at least one criteria.
11. The method of claim 1, wherein said registered element performs said marking said point in said corresponding time sequence by inserting a marker record into a time-ordered series of records corresponding to said portion of file operations, and the method further comprising: transmitting said records from a source location including said source data to a target location including said copy of said source data wherein said portion of file operations are applied to said copy independent of whether additional data operations are applied to said source data.
12. A computer program product for obtaining a copy of source data in a consistent state comprising code that: records one or more file operations having a corresponding time sequence which modify said source data; receives a request for a copy of the source data in a consistent state; determines at which point in said corresponding time sequence said source data is in a consistent state as a result of applying a portion of said file operations; marks said point in said corresponding time sequence at which said source data is in a consistent state; and applies the portion of file operations determined to place said source data in a consistent state to said copy of said source data.
13. The computer program product of claim 12, wherein said code that marks said point in said corresponding time sequence includes code that: determines a corresponding one of said file operations at which said consistent state is obtained.
14. The computer program product of claim 1, further comprising code that: records said portion of file operations in real time as applied to the source data wherein said portion of file operations are recorded as a time-ordered series of records, each of said records corresponding to one of said file operations in said portion; inserts a marker record into said series of records indicating when said source data is in a consistent state; and applies file operations corresponding to said series of records to said copy of the source data until a marker record is determined.
15. The computer program product of claim 14, further comprising code that: pauses further application of recorded file operations to said copy of the source data when performing an operation using said copy of said source data while allowing new incoming file operations to be applied to said source data.
16. The computer program product of claim 15, wherein said code that pauses further application of recorded file operations to said copy of the source data is controlled by executing a target script on a target system including said copy of said source data.
17. The computer program product of claim 16, further comprising code that: transmits records corresponding to said portion of file operations to said target system wherein said transmitting of said records is paused in accordance with processing on a source system including said source data.
18. The computer program product of claim 17, further comprising code that: executes a source script on said source system to control transmission of said records from said source system to said target system, wherein said marker record includes an indicator corresponding to at least one of: said source script, said target script, a first flag value indicating whether to pause application of said portion of data operations to said copy of said source data, a second flag value indicating whether to pause transmission of said records from said source system to said target system.
19. A system comprising: a first data storage system including a data source; a second data storage system including a data target; first code included in a first host connected to said first data storage system which: records one or more file operations having a corresponding time sequence which modify said source data; receives a request for a copy of the source data in a consistent state; determines at which point in said corresponding time sequence said source data is in a consistent state as a result of applying a portion of said file operations; and marks said point in said corresponding time sequence at which said source data is in a consistent state; and second code included in a second host connected to said second data storage system which: applies the portion of file operations determined to place said source data in a consistent state to said copy of said source data.
20. The system of claim 19, wherein said request for a copy of the source data in a consistent state is made, using an application programming interface included in said first host, to a component included in said first host which coordinates access to said source data, and said component further comprising third code which: causes a writer of said source data to complete existing transactions without starting any new transactions and to flush data cache stores of said source data, said writer pausing commencing of new transactions in accordance with at least one criteria; communicates with a registered element that said source data is in a consistent state; and communicates with said writer to resume new transactions in accordance with said at least one criteria.